Ping Nie

Context Forcing: Consistent Autoregressive Video Generation with Long Context

Feb 05, 2026
Via arXiv

GraphDancer: Training LLMs to Explore and Reason over Graphs via Curriculum Reinforcement Learning

Jan 24, 2026
Via arXiv

Beyond Single-shot Writing: Deep Research Agents are Unreliable at Multi-turn Report Revision

Jan 19, 2026
Via arXiv

A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports

Oct 02, 2025
Via arXiv

VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation

Jun 04, 2025
Via arXiv

StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs

May 26, 2025
Via arXiv

Likert or Not: LLM Absolute Relevance Judgments on Fine-Grained Ordinal Scales

May 25, 2025
Via arXiv

VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation

May 20, 2025
Via arXiv

MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems

May 16, 2025
Via arXiv

Breaking the Batch Barrier (B3) of Contrastive Learning via Smart Batch Mining

May 16, 2025
Via arXiv